Caterpillars, context, tree automata and tree pattern matching

Authors

  • Anne Brüggemann-Klein
  • Derick Wood
Abstract

We present a novel, yet simple, technique for the specification of context in structured documents that we call caterpillar expressions. Although we are primarily applying this technique in the specification of context-dependent style sheets for HTML, SGML and XML documents, it can also be used for query specification for structured documents, as we shall demonstrate, and for the specification of computer program transformations. From a conceptual point of view, structured documents are trees, and one of the oldest and best-established techniques to process trees and, hence, structured documents are tree automata. We present a number of theoretical results that allow us to compare the expressive power of caterpillar expressions and caterpillar automata, their companions, to the expressive power of tree automata. In particular, we demonstrate that each caterpillar expression describes a regular tree language that is, hence, recognizable by a tree automaton. Finally, we employ caterpillar expressions for tree pattern matching. We demonstrate that caterpillar automata are able to solve tree-pattern-matching problems for some, but not all, types of tree inclusion that Kilpeläinen investigated in his PhD thesis. In simulating tree pattern matching with caterpillar automata, we reprove some of Kilpeläinen's results in a uniform framework.

Similar resources

Caterpillars: A Context Specification Technique

We present a novel, yet simple, technique for the specification of context in structured documents that we call caterpillar expressions. Although we are primarily applying this technique in the specification of context-dependent style sheets for HTML, SGML and XML documents, it can also be used for query specification for structured documents, as we shall demonstrate, and for the specification of ...

Multidimensional fuzzy finite tree automata

This paper introduces the notion of multidimensional fuzzy finite tree automata (MFFTA) and investigates its closure properties from the area of automata and language theory. MFFTA are a superclass of fuzzy tree automata whose behavior is generalized to adapt to multidimensional fuzzy sets. An MFFTA recognizes a multidimensional fuzzy tree language which is a regular tree language so that for e...

An Automata-Based Approach to Pattern Matching

Due to its importance in security, syntax analysis has found usage in many high-level programming languages. The Lisp language has its share of operations for evaluating regular expressions, but native parsing of Lisp code in this way is unsupported. Matching on lists requires a significantly more complicated model, with a different programmatic approach than that of string matching. This work ...

A missing link in root-to-frontier tree pattern matching

Tree pattern matching (tpm) algorithms play an important role in practical applications such as compilers and XML document validation. Many tpm algorithms based on tree automata have appeared in the literature. For reasons of efficiency, these automata are preferably deterministic. Deterministic root-to-frontier tree automata (drftas) are less powerful than nondeterministic ones, and no root-to...

Optimal Left-to-Right Pattern-Matching Automata

We propose a practical technique to compile pattern-matching for prioritised overlapping patterns in equational languages into a minimal, deterministic, left-to-right, matching automaton. First, we present a method for constructing a tree matching automaton for such patterns. This allows pattern-matching to be performed without any backtracking. Space requirements are reduced by using a directed...

Publication date: 1999